Self-Tuning Parameters for Decision Tree Algorithm Based on Big Data Analytics

Authors

Abstract

Big data is usually unstructured, and many applications require analysis in real time. The decision tree (DT) algorithm is widely used to analyze big data. Selecting the optimal depth of a DT is a time-consuming process, as it requires many iterations. In this paper, we have designed a modified version of the DT algorithm. It aims to achieve the optimal depth by self-tuning its running parameters and improving accuracy. The efficiency of the modified DT was verified using two datasets (airport and fire datasets). The airport dataset has 500,000 instances and the fire dataset has 600,000 instances. A comparison has been made between the modified DT and the standard DT, with results showing that the modified DT performs better. This comparison was conducted on multi-node Apache Spark using Amazon Web Services. The resulting accuracy shows an increase of 6.85% for the first dataset and 8.85% for the other dataset. In conclusion, the modified DT showed better accuracy in handling different-sized datasets compared to the standard DT algorithm.
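As a rough illustration of why depth selection needs repeated iterations, the following minimal Python sketch searches over tree depths and stops once deeper trees no longer improve a validation score. It is not the authors' method, which the abstract does not detail: the function `tune_depth`, its parameters `min_gain` and `patience`, and the toy scoring function are all hypothetical names introduced for this example.

```python
from typing import Callable, Tuple


def tune_depth(
    evaluate: Callable[[int], float],
    max_depth: int = 30,
    min_gain: float = 1e-3,
    patience: int = 2,
) -> Tuple[int, float]:
    """Return (best_depth, best_score), stopping early once deeper trees stop helping.

    `evaluate(depth)` is assumed to train a tree of the given depth and
    return its accuracy on held-out data (hypothetical callback).
    """
    best_depth, best_score = 1, evaluate(1)
    stale = 0  # consecutive depths without a meaningful improvement
    for depth in range(2, max_depth + 1):
        score = evaluate(depth)
        if score > best_score + min_gain:
            best_depth, best_score = depth, score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break  # deeper trees have stopped paying off
    return best_depth, best_score


def toy_validation_accuracy(depth: int) -> float:
    # Hypothetical stand-in for a real train-and-score step; peaks at depth 6.
    return 0.90 - 0.01 * abs(depth - 6)


best_depth, best_score = tune_depth(toy_validation_accuracy)
print(best_depth, round(best_score, 2))
```

In a real setting the `evaluate` callback would wrap whatever training and scoring routine is in use; the early-stopping rule simply avoids training every depth up to `max_depth`.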

Similar resources

Starfish: A Self-tuning System for Big Data Analytics

Modern industrial, government, and academic organizations are collecting massive amounts of data ("big data") at an unprecedented scale and pace. The ability to perform timely and cost-effective analytical processing of such large datasets to extract deep insights is now a key ingredient for success. These insights can drive automated processes for advertisement placement, improve customer relat...

A Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection

Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change. The popularity of big data analytics platforms, which are often available as open-source, has not remained unnoticed by big companies. Google uses MapReduce for PageRank and inverted indexes....

Using 'Big Data' for analytics and decision support

People and the computers they use are generating large amounts of varied data. The phenomenon of capturing and trying to use all of the semi-structured and unstructured data has been called by vendors and bloggers "Big Data". Organizations can capture and store data of many types from almost any source, but capturing and storing data only adds value when it has a useful purpose. Big Data must b...

A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining

Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...

Application of Big Data Analytics in Power Distribution Network

Smart grid enhances optimization in generation, distribution and consumption of the electricity by integrating information and communication technologies into the grid. Today, utilities are moving towards smart grid applications, most common one being deployment of smart meters in advanced metering infrastructure, and the first technical challenge they face is the huge volume of data generated ...

Journal

Journal title: Computers, Materials & Continua

Year: 2023

ISSN: 1546-2218, 1546-2226

DOI: https://doi.org/10.32604/cmc.2023.034078